{
 "cells": [
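  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Natural Language Generation\n",
    "`NLG (Natural Language Generation)`: generating new text.   \n",
    "It is a component of many tasks, for example:\n",
    "- Machine translation\n",
    "- Abstractive summarization\n",
    "- Dialogue, either chit-chat or task-oriented\n",
    "- Creative writing, such as storytelling or poetry generation\n",
    "- Question answering where the answer is generated rather than extracted from a text or a knowledge base\n",
    "- Image captioning\n",
    "- ..."
   ]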
  },
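  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Language Modeling\n",
    "`Language Modeling`: the task of predicting the next word given the sequence of words so far.    \n",
    "$$P(y_t|y_1,...,y_{t-1})$$   \n",
    "A system that produces this probability distribution is called a `language model`.   \n",
    "If that system is an `RNN`, it is called an `RNN-LM`."
   ]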
  },
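  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the definition above concrete, here is a minimal sketch of a count-based language model. It assumes a bigram approximation, where the history $y_1,...,y_{t-1}$ is truncated to the single previous word $y_{t-1}$; an `RNN-LM` would condition on the full history instead. The function name `train_bigram_lm` and the toy corpus are made up for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter, defaultdict\n",
    "\n",
    "def train_bigram_lm(sentences):\n",
    "    # Estimate P(y_t | y_{t-1}) from bigram counts.\n",
    "    counts = defaultdict(Counter)\n",
    "    for sentence in sentences:\n",
    "        tokens = ['<s>'] + sentence.split() + ['</s>']\n",
    "        for prev, nxt in zip(tokens, tokens[1:]):\n",
    "            counts[prev][nxt] += 1\n",
    "    # Normalize each row of counts into a probability distribution.\n",
    "    return {\n",
    "        prev: {word: c / sum(nxts.values()) for word, c in nxts.items()}\n",
    "        for prev, nxts in counts.items()\n",
    "    }\n",
    "\n",
    "corpus = ['the cat sat', 'the dog sat', 'the cat ran']  # toy data (assumption)\n",
    "lm = train_bigram_lm(corpus)\n",
    "print(lm['the'])  # distribution over the word that follows 'the'"
   ]
  },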
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conditional Language Modeling\n",
    "`Conditional Language Modeling`: the task of predicting the next word given the sequence of words so far\n",
    "and some additional input $x$:\n",
    "$$P(y_t|y_1,...,y_{t-1},x)$$   \n",
    "\n",
    "Examples of conditional language modeling tasks:\n",
    "- Machine translation: `(x=source sentence, y=target sentence)`\n",
    "- Summarization: `(x=input text, y=summarized text)`\n",
    "- Dialogue: `(x=dialogue history, y=next utterance)`"
   ]
  },
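  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a worked step, the per-word distributions above combine by the chain rule into the probability of a whole output sequence $y=(y_1,...,y_T)$ given $x$. This is the standard factorization rather than anything specific to this notebook; the sequence length $T$ is notation introduced here.\n",
    "$$P(y|x)=\\prod_{t=1}^{T}P(y_t|y_1,...,y_{t-1},x)$$"
   ]
  },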
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
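  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-05-01T15:50:39.476634Z",
     "start_time": "2020-05-01T15:50:39.468980Z"
    }
   },
   "source": [
    "## Training an RNN-Based Conditional Language Model\n",
    "`Teacher Forcing`: during training, the decoder is fed the target (ground-truth) sequence as its input at every step, regardless of what it predicted at the previous step."
   ]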
  },
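  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following is a minimal sketch of how teacher forcing arranges decoder inputs and targets. It is not a real RNN: `toy_decoder_step` is a hypothetical stand-in that returns a uniform distribution, and the token sequence is made up. The point is only that the decoder input at each step is the gold previous token, and the loss is the average per-step cross-entropy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "# Toy target sequence (assumption, for illustration only).\n",
    "target = ['<s>', 'the', 'cat', 'sat', '</s>']\n",
    "\n",
    "# Teacher forcing: decoder inputs are the gold tokens shifted right by one.\n",
    "decoder_inputs = target[:-1]\n",
    "decoder_targets = target[1:]\n",
    "\n",
    "def toy_decoder_step(prev_token, vocab):\n",
    "    # Hypothetical stand-in for one decoder step: a uniform distribution\n",
    "    # over the vocabulary that ignores prev_token.\n",
    "    return {word: 1.0 / len(vocab) for word in vocab}\n",
    "\n",
    "vocab = sorted(set(target))\n",
    "loss = 0.0\n",
    "for gold_prev, gold_next in zip(decoder_inputs, decoder_targets):\n",
    "    # The step is fed the gold token, not the model's own previous prediction.\n",
    "    probs = toy_decoder_step(gold_prev, vocab)\n",
    "    loss += -math.log(probs[gold_next])  # cross-entropy at this step\n",
    "loss /= len(decoder_targets)  # average over time steps\n",
    "print(loss)"
   ]
  },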
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-05-01T15:24:54.921851Z",
     "start_time": "2020-05-01T15:24:54.916240Z"
    }
   },
   "source": [
    "# Decoding Algorithms\n",
    "Common `decoding algorithms`:\n",
    "- Greedy decoding\n",
    "- Beam search\n",
    "- Sampling methods\n",
    "\n",
    "- Softmax temperature"
   ]
  },
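  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of a single decoding step, the cell below contrasts greedy decoding (take the argmax) with sampling from a softmax whose logits are divided by a temperature. The helper names `softmax`, `greedy_pick`, and `sample_pick` and the logits are assumptions made for illustration; a real decoder would get its logits from the model at each step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "import random\n",
    "\n",
    "def softmax(logits, temperature=1.0):\n",
    "    # Divide the logits by the temperature before normalizing.\n",
    "    scaled = [z / temperature for z in logits]\n",
    "    m = max(scaled)  # subtract the max for numerical stability\n",
    "    exps = [math.exp(z - m) for z in scaled]\n",
    "    total = sum(exps)\n",
    "    return [e / total for e in exps]\n",
    "\n",
    "def greedy_pick(logits):\n",
    "    # Greedy decoding: choose the highest-scoring token index.\n",
    "    return max(range(len(logits)), key=lambda i: logits[i])\n",
    "\n",
    "def sample_pick(logits, temperature=1.0, rng=random):\n",
    "    # Sampling: draw a token index from the softmax distribution.\n",
    "    probs = softmax(logits, temperature)\n",
    "    return rng.choices(range(len(logits)), weights=probs, k=1)[0]\n",
    "\n",
    "logits = [2.0, 1.0, 0.1]  # made-up scores for a 3-word vocabulary\n",
    "print(greedy_pick(logits))\n",
    "print(softmax(logits, temperature=0.5))  # lower temperature: more peaked\n",
    "print(softmax(logits, temperature=2.0))  # higher temperature: flatter\n",
    "print(sample_pick(logits, temperature=1.0, rng=random.Random(0)))"
   ]
  },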
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
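  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Beam search can be sketched in a similar way. The cell below assumes a hypothetical toy model, `TOY_MODEL`, whose next-token probabilities depend only on the last token, and it compares hypotheses by raw log-probability with no length normalization. With these made-up numbers, a beam of size 1 (equivalent to greedy decoding) commits to `a` first, while a beam of size 2 finds the more probable sequence that starts with `b`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "# Hypothetical next-token probabilities, keyed by the last token.\n",
    "TOY_MODEL = {\n",
    "    '<s>': {'a': 0.6, 'b': 0.4},\n",
    "    'a': {'a': 0.3, 'b': 0.3, '</s>': 0.4},\n",
    "    'b': {'a': 0.05, 'b': 0.05, '</s>': 0.9},\n",
    "}\n",
    "\n",
    "def beam_search(model, beam_size=2, max_len=5):\n",
    "    # Each hypothesis is a (log_probability, tokens) pair.\n",
    "    beams = [(0.0, ['<s>'])]\n",
    "    finished = []\n",
    "    for _ in range(max_len):\n",
    "        candidates = []\n",
    "        for score, tokens in beams:\n",
    "            for word, p in model[tokens[-1]].items():\n",
    "                candidates.append((score + math.log(p), tokens + [word]))\n",
    "        # Keep only the beam_size best extensions.\n",
    "        candidates.sort(key=lambda c: c[0], reverse=True)\n",
    "        beams = []\n",
    "        for cand in candidates[:beam_size]:\n",
    "            if cand[1][-1] == '</s>':\n",
    "                finished.append(cand)  # hypothesis is complete\n",
    "            else:\n",
    "                beams.append(cand)     # hypothesis is extended next step\n",
    "        if not beams:\n",
    "            break\n",
    "    # Prefer finished hypotheses; fall back to unfinished ones if none ended.\n",
    "    return max(finished or beams, key=lambda c: c[0])\n",
    "\n",
    "best_score, best_tokens = beam_search(TOY_MODEL, beam_size=2)\n",
    "print(best_tokens, math.exp(best_score))"
   ]
  },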
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-05-01T15:25:59.843727Z",
     "start_time": "2020-05-01T15:25:59.835859Z"
    }
   },
   "source": [
    "# Natural Language Generation Tasks\n",
    "`NLG tasks`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Evaluation"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
